To provide users with the ability to interactively explore a 3D model of the island of Sardinia, we have developed a system that transforms the elevation data and satellite images used in traditional cartography into a multiresolution VRML model that can be viewed on a graphics workstation. To make the exploration of very large models possible at interactive speeds, we have developed a special high-speed 3D Web browser that incorporates sophisticated time-critical rendering techniques.
Several steps have to be performed to obtain a 3D model of the island of Sardinia. The first step involves the registration of the data with respect to a geographical coordinate system, taking into account the satellite viewing direction and the sphericity of the Earth. A second step normalizes values coming from different areas so that corresponding bands have the same distribution; this is necessary because meteorological conditions differ between satellite image captures. Another step merges the different areas into a single image; in our case we merged four different satellite images to cover the whole island. The next step is the composition of images corresponding to different wavelengths to obtain the desired image; we selected the blue, green, and red wavelengths to obtain a realistic representation of Sardinia, like a color picture taken from space. The last operation is the color enhancement of the resulting image to correct unwanted deviations from visually realistic colors, mainly due to atmospheric effects.
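As an illustration, the normalization and composition steps could be sketched as follows. This is a minimal sketch, assuming each band is available as a NumPy array; the function names are ours and do not belong to the original processing chain.

    import numpy as np

    def match_histogram(band, reference):
        # Remap 'band' so that its value distribution matches 'reference',
        # compensating for different meteorological conditions between scenes.
        src_vals, src_idx, src_cnt = np.unique(band.ravel(),
                                               return_inverse=True,
                                               return_counts=True)
        ref_vals, ref_cnt = np.unique(reference.ravel(), return_counts=True)
        src_cdf = np.cumsum(src_cnt) / band.size
        ref_cdf = np.cumsum(ref_cnt) / reference.size
        mapped = np.interp(src_cdf, ref_cdf, ref_vals)
        return mapped[src_idx].reshape(band.shape)

    def compose_rgb(red, green, blue, gain=1.2):
        # Stack the red, green and blue bands into a color image; the gain
        # is only a placeholder for the final color-enhancement step.
        rgb = np.stack([red, green, blue], axis=-1).astype(np.float32)
        return np.clip(rgb * gain, 0.0, 255.0).astype(np.uint8)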
The complexity of the terrain model represented as a regular grid is far too high to allow interactive exploration even on high-end workstations. To obtain interactive speeds during terrain visualization, we have to optimize the model so as to allow the renderer to trade rendering quality for speed. All the optimizations are performed automatically by a software tool we created that takes as input the elevation data, the satellite image, and the desired quality parameters for the output. The original terrain data (a single 359x681 regular grid containing one elevation sample every 400 m) is subdivided into 16x10 quadrants that can then be culled independently. Each of the 160 quadrants is then represented at three levels of detail by producing, from the original regular sub-grid, a series of irregular triangular meshes with a decimation algorithm that iteratively removes vertices from planar regions of the mesh [7]. At the same time, images are scaled to fit into texture memory. Different planarity tolerances, adaptively adjusted during decimation, are used to obtain the desired three levels of detail. Cracks between adjacent quadrants are avoided by using fixed low tolerances at the borders. The appropriate level of detail for each quadrant can then be chosen by the renderer at run time. Currently, the maximum complexity of the model after optimization is 108617 triangles, while the minimum complexity is 41987 triangles.
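The overall structure of this preprocessing could be sketched as follows. This is a minimal sketch under our own assumptions: the tolerance values are invented for illustration, and decimate() stands in for the vertex-removal algorithm of [7].

    import numpy as np

    QUAD_ROWS, QUAD_COLS = 10, 16      # 16x10 quadrants, as described above
    LOD_TOLERANCES = [0.5, 2.0, 8.0]   # hypothetical planarity tolerances
    BORDER_TOLERANCE = 0.1             # fixed low tolerance at quadrant borders

    def build_terrain_lods(grid):
        # grid: 359x681 array, one elevation sample every 400 m.
        rows = np.array_split(grid, QUAD_ROWS, axis=0)
        quadrants = [q for row in rows
                       for q in np.array_split(row, QUAD_COLS, axis=1)]
        lods = []
        for quad in quadrants:
            # One irregular triangular mesh per tolerance: fine, medium, coarse.
            lods.append([decimate(quad, tol, BORDER_TOLERANCE)
                         for tol in LOD_TOLERANCES])
        return lods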
To allow users to view high-resolution versions of a region, we add hyperlinks from each of the quadrants to a high-resolution version of the region surrounding it, so that the interactive selection of a quadrant triggers the loading and display of its high-resolution version. Since the high-resolution regions are smaller than the entire model but have access to the same hardware resources (rendering speed, texture memory) when rendered, more detail can be included. These high-resolution versions are constructed automatically by our software tool by recursively applying to the regions surrounding each quadrant the same algorithm we apply to the entire model.
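The recursion might be organized along these lines; the helper names are hypothetical and only meant to convey the structure.

    def build_model(region, depth):
        model = optimize(region)   # quadrant split plus three LODs, as above
        if depth > 0:
            for quad in model.quadrants:
                # Each quadrant hyperlinks to a finer model of its surroundings.
                hires = build_model(surrounding_region(quad), depth - 1)
                quad.hyperlink = url_of(hires)
        return model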
Visual annotations are placed at interesting sites on the 3D model in the form of 3D objects with an associated URL. The interactive selection of one of these markers during navigation is translated by the browser into a request to fetch and view the associated descriptive document, which can be text, still images, animations, or even other 3D models.
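For instance, such a marker could be generated as a VRML 1.0 WWWAnchor node; this is an illustrative sketch, not the exact geometry we use.

    def annotation_marker(x, y, z, url, size=500.0):
        # A clickable cube: selecting it makes the browser fetch 'url'.
        return ("WWWAnchor {\n"
                "  name \"%s\"\n"
                "  Separator {\n"
                "    Translation { translation %g %g %g }\n"
                "    Cube { width %g height %g depth %g }\n"
                "  }\n"
                "}\n" % (url, x, y, z, size, size, size))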
Figure 2. Annotations
i3D's device configuration uses a Spaceball and a mouse as input devices. The Spaceball is used for the continuous specification of the camera's position and orientation using an eye-in-hand metaphor [8], while the mouse is used to select objects and access media documents. Both of the user's hands can therefore be employed simultaneously to input information. Keyboard commands are used to control various visibility flags and rendering modes. The ability to continuously specify complex camera motions in an intuitive way, together with high visual feedback rates, provides an accurate simulation of motion parallax, one of the most important depth cues when dealing with large environments [3][4].
During navigation, i3D's time-critical rendering engine is activated at regular intervals by the main i3D event loop and requested to refresh the screen while adhering to the user-specified timing constraint. At each activation, the rendering engine renders a single frame by executing a well-defined sequence of operations.
First, the database is traversed and the objects visible from the observer's viewpoint are identified by hierarchically determining the visibility of portions of the scene through a traversal of an octree spatial subdivision. Each identified object is then compiled into a graphical description by stripping off its appearance attributes and compiling them into a device-dependent sequence of drawing commands. During this conversion, geometries are optimized to reduce their rendering time; in particular, structured triangular meshes are generated from the triangle lists stored in the database. To avoid recreating compiled versions at each frame, as is done in systems like Performer [6], the graphical descriptions generated for each database object are cached and reused while still valid.
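The visibility traversal and the caching of compiled descriptions could be sketched as follows; the frustum, bounding boxes, and compiler are assumed abstractions, not i3D's actual interfaces.

    def visible_objects(node, frustum, out):
        # Hierarchical culling: a whole octree subtree is discarded as soon
        # as its bounding box falls outside the view frustum.
        if not frustum.intersects(node.bbox):
            return
        out.extend(node.objects)
        for child in node.children:
            visible_objects(child, frustum, out)

    def compiled_description(obj, cache):
        # Reuse the device-dependent command sequence while still valid,
        # instead of recompiling it at every frame.
        entry = cache.get(obj.id)
        if entry is None or entry.stamp != obj.stamp:
            entry = compile_to_commands(obj)   # hypothetical compiler
            cache[obj.id] = entry
        return entry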
To control the number of polygons rendered at each frame, so as to be able to meet the timing requirements, the time-critical rendering engine traverses the generated display list and selects the level of detail at which each of the objects will be represented. Level-of-detail selection is based on the importance of each object for the current frame, determined by computing an approximation of its projected size on the screen, and on feedback about the time required to render previous frames. The feedback algorithm is similar to the one used by Performer [6]. Update rates are associated with the different objects in the database to avoid recomputing their screen projections and compiled versions at each frame.
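A possible shape for this selection step, with an invented triangle budget and a feedback factor loosely modeled on Performer's stress mechanism [6]:

    def select_lods(objects, camera, budget, target_time, last_frame_time):
        # Shrink the triangle budget when the previous frame was late,
        # grow it back when there was slack.
        stress = max(last_frame_time / target_time, 0.1)
        allowed = budget / stress
        # More important objects (larger projected screen size) come first.
        objects.sort(key=lambda o: projected_size(o, camera), reverse=True)
        used = 0
        for obj in objects:
            choice = obj.lods[-1]            # coarsest level as fallback
            for lod in obj.lods:             # levels ordered fine -> coarse
                if used + lod.triangles <= allowed:
                    choice = lod
                    break
            obj.selected = choice
            used += choice.triangles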
Once the levels of detail are selected, the display list is sorted to maximize rendering state coherence and rendered by executing the compiled command sequence for the selected level of detail of each object. Rendering statistics for the current frame are updated and stored so as to be used when selecting the levels of detail for the next frame.
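Putting the pieces together, a frame could be drawn roughly like this; texture_id, material_id, and execute() are illustrative names rather than i3D's actual API.

    import time

    def render_frame(display_list, stats):
        # Sorting by (texture, material) lets consecutive objects share
        # rendering state and minimizes expensive state changes.
        display_list.sort(key=lambda o: (o.selected.texture_id,
                                         o.selected.material_id))
        start = time.perf_counter()
        for obj in display_list:
            execute(obj.selected.commands)   # replay the compiled sequence
        # Feedback for the level-of-detail selection of the next frame.
        stats.last_frame_time = time.perf_counter() - start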
Thanks to these rendering optimizations and to the preprocessing performed on the data, the textured 3D model of the island of Sardinia can be interactively explored at more than 10 frames per second on a Silicon Graphics Onyx RE2.
In the context of the Virtual Sardinia project we use Web Live to offer users the possibility of selecting and viewing videotapes showing aspects of the island. For example, videotapes showing scenes of life in a particular region can be viewed by selecting particular markers on the terrain during 3D navigation. Another important use of Web Live is to allow users who do not have a graphics workstation to view pre-recorded flights over the 3D model. In this case, Web Live is used to distribute over the network 30 minutes of live video taken while navigating within the model and interacting with it, showing the capabilities of the browser, the model itself, and the information hyperlinked to it, offering the remote visitor a virtual tour of Sardinia.
Figure 3. Virtual Sardinia explored through Web Live
The basic technique used by Web Live is the server push feature introduced by Netscape Communications with release 1.1 of their Netscape Navigator WWW browser (and also available in other browsers, for instance the VOLBrowser of Video On Line).
With this mechanism the server sends down a chunk of data; the browser displays it, leaving the connection open to receive more data, either for a fixed time or until the client interrupts the connection.
The MIME type used for server push is called "multipart/x-mixed-replace". The "replace" indicates that each new data block will cause the previous data block to be replaced, that is, new data will be displayed instead of (not in addition to) old data. A simple sketch of a CGI program that uses this technique is shown below.
print "Content-type: multipart/x-mixed-replace;boundary=---ThisRandomString---" print "---ThisRandomString---" while true { print "Content-type: image/jpg" print <image> print "---ThisRandomString---" }
Figure 4. Web Live functionality map
Web Live performs real-time image grabbing from an analog or digital video source (camera, VCR, TV tuner, Laser Disc, ...). The remote user interacts with Web Live through a CGI program that acts as a client interface. The user can select the frame size, image quality, and frame rate of the sequence of images and then watch the real-time grabbed video frames. The Web Live approach offers two main advantages with respect to the distribution of precomputed movies: first, since frames are displayed by the client as soon as they arrive, latency at the beginning of the movie is drastically reduced; second, since the viewing parameters are determined by the client, the user can configure them to fit her needs exactly, and does not depend on choices made at the server side as with precomputed movies.
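The client interface might read those viewing parameters along these lines, using Python's standard cgi module; the parameter names and defaults are invented for illustration.

    import cgi

    form = cgi.FieldStorage()
    # Hypothetical parameter names; each value is chosen by the user
    # instead of being fixed at the server side.
    width   = int(form.getvalue("width", "384"))      # frame size
    quality = int(form.getvalue("quality", "75"))     # JPEG quality
    rate    = float(form.getvalue("rate", "2.0"))     # frames per second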
Starting from a movie, a number of frames are grabbed at the desired rate and stored on disk at half video size in compressed format, with timing information coded in the file name. From these, quarter-size and eighth-size video frames are also computed. Once an entire sequence has been acquired, the system can show the user the entire grabbed movie in a single view, by putting the eighth-size frames in a parametric NxM table, from the first to the last image, with the appropriate time step to fill the table. From here the user can easily browse the movie. The allowed operations are:
In the Virtual Sardinia context, Web Show has been used to allow the browsing of the previously mentioned 30-minute tape, from which we extracted 1389 frames, at the approximate rate of 0.5 frames per second.
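The single-view table described above could be assembled as in the following sketch, which assumes the Pillow imaging library and eighth-size frames whose file names encode their timestamps; both assumptions are ours.

    import glob
    from PIL import Image

    def contact_sheet(pattern, cols, rows, thumb_w, thumb_h):
        # Pick evenly spaced frames so the NxM table spans the whole movie.
        frames = sorted(glob.glob(pattern))   # timing is coded in the names
        step = max(len(frames) // (cols * rows), 1)
        chosen = frames[::step][:cols * rows]
        sheet = Image.new("RGB", (cols * thumb_w, rows * thumb_h))
        for i, name in enumerate(chosen):
            thumb = Image.open(name).resize((thumb_w, thumb_h))
            sheet.paste(thumb, ((i % cols) * thumb_w, (i // cols) * thumb_h))
        return sheet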
Figure 5. Virtual Sardinia explored through Web Show
This is another way to remotely view the precomputed exploration of the model, with the advantage that the user can easily associate sequences of the video with their corresponding geographical locations. In combination with the interactive 3D navigation, this facility can also be used to take a guided tour of a certain area before exploring it interactively.
Figure 6. Clickable maps